Modeling Auditory Perception for Robust Speech Recognition
Abstract
Forward masking stimuli: (A) Large timescale view of a single 2AFC trial; (B) Fourier Transform of the probe signal (128 ms rectangular window); (C) Smaller timescale view of the probe following the masker by 15 ms.
Average forward masking data (circles) and standard deviations (error bars), together with the model fit (lines), as a function of masker level across 5 octaves, with probe delays of 15, 30, 60, and 120 ms as a parameter.
Average forward masking data at 1 kHz: (a) as a function of the log delay with masker level as a parameter; and (b) as the dynamic range below masker as a function of the masker level with probe delay as a parameter. The dotted line reflects the probe threshold in quiet.
(A) A prototypical I/O curve for a single channel in the dynamic model; and schematic output trajectories corresponding to a level change at three different rates for (B) decreasing inputs, and (C) increasing inputs.
Geometry to derive attack (downward adaptation) parameters from forward masking thresholds as a function of masker duration.
The model's prediction of the decay of forward masking as a function of masker level at 1 kHz.
Adaptation to, and recovery after, a pulse: (A) The response to the second pulse is diminished; and (B) Impulses, corresponding to onsets, are initially masked (similar to figures in [Goldhor 1985]).
Using the model to predict other forward masking data: (A) wide-band masker and probe [Plomp 1964]; (B) wide-band masker, sinusoidal probe at 1 kHz [Moore and Glasberg 1983]; (C) sinusoidal masker and probe at 1 kHz [Jesteadt et al. 1982]. (D) The equation provided in [Jesteadt et al. 1982] predicting the present data.
Peak isolation processing: log spectrum of the vowel [i] after (A) cepstral truncation; (B) raised-sine cepstral liftering; and (C) half-wave rectification.
Comparisons of the amplitude modulation detection models. Dashed lines indicate standard deviations. The modulation filtering plots show the outputs of six auditory channels, each filtered by a modulation filter centered at 100 Hz.
Similar resources
Correlation between Auditory Spectral Resolution and Speech Perception in Children with Cochlear Implants
Background: Variability in speech performance is a major concern for children with cochlear implants (CIs). Spectral resolution is an important acoustic component in speech perception. Considerable variability and limitations of spectral resolution in children with CIs may lead to individual differences in speech performance. The aim of this study was to assess the correlation between auditory ...
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have shown their performance in speech recognition systems for feature extraction and acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...
Effect of signal to noise ratio on the speech perception ability of older adults
Background: Speech perception ability depends on auditory and extra-auditory elements. The signal-to-noise ratio (SNR) is an extra-auditory element that affects the ability to follow speech normally and maintain a conversation. Difficulty perceiving speech in noise is a common complaint of the elderly. In this study, the importance of SNR magnitude as an extra-auditory effect on speech...
Psychoacoustics and speech perception in individuals with auditory neuropathy and normal individuals
Background: The main result of hearing impairment is a reduction in speech perception. Patients with auditory neuropathy can hear but cannot understand. Their difficulties have been traced to timing-related deficits, revealing the importance of the neural encoding of timing cues for understanding speech. Objective: In the present study psychoacoustic perception (minimal noticeable differen...
Robust automatic speech recognition and modeling of auditory discrimination experiments with auditory spectro-temporal features
Stochastic perceptual auditory-event-based models for speech recognition
We have developed a statistical model of speech that incorporates certain temporal properties of human speech perception. The primary goal of this work is to avoid a number of current constraining assumptions for statistical speech recognition systems, particularly the model of speech as a sequence of stationary segments consisting of uncorrelated acoustic vectors. A focus on perceptual models ...
Publication date: 1998